Spatial, Structural and Temporal Feature Learning for Human Interaction Prediction
Abstract
Predicting an interaction before it is fully executed is important in applications such as human-robot interaction and video surveillance. In a two-human interaction scenario, there is often a contextual dependency structure between the global interaction context of the two humans and the local context of the different body parts of each human. In this paper, we propose to learn the structure of the interaction contexts and combine it with the spatial and temporal information of a video sequence to better predict the interaction class. The structural models, including the spatial and the temporal models, are learned with Long Short-Term Memory (LSTM) networks to capture the dependency between the global and local contexts of each RGB frame and each optical flow image, respectively. LSTM networks are also capable of detecting the key information in the global and local interaction contexts. Moreover, to effectively combine the structural models with the spatial and temporal models for interaction prediction, a ranking score fusion method is introduced that automatically computes the optimal weight of each model for score fusion. Experimental results on the BIT-Interaction and UT-Interaction datasets demonstrate the benefits of the proposed method.
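The abstract does not say how the ranking score fusion computes its weights. As a rough illustration of the general idea of weighted score fusion across several models, the sketch below combines per-class scores using hand-picked weights. The function name `fuse_scores`, the model names, the scores, and the weights are all hypothetical, and this is plain weighted late fusion, not the paper's ranking-based weight computation.

```python
from typing import Dict, List, Tuple


def fuse_scores(
    model_scores: Dict[str, List[float]],
    weights: Dict[str, float],
) -> Tuple[List[float], int]:
    """Weighted late fusion of per-class scores (illustrative only).

    model_scores maps a model name to its score for each class.
    weights maps the same model names to non-negative weights.
    Returns the fused per-class scores and the index of the top class.
    """
    total = sum(weights[name] for name in model_scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    num_classes = len(next(iter(model_scores.values())))
    fused = [0.0] * num_classes
    for name, scores in model_scores.items():
        if len(scores) != num_classes:
            raise ValueError(f"model {name!r} has a different number of classes")
        w = weights[name] / total  # normalise so the weights sum to 1
        for c, s in enumerate(scores):
            fused[c] += w * s

    predicted = max(range(num_classes), key=fused.__getitem__)
    return fused, predicted


# Hypothetical scores for three interaction classes from four models.
example_scores = {
    "spatial": [0.6, 0.3, 0.1],
    "temporal": [0.2, 0.5, 0.3],
    "spatial_structural": [0.5, 0.4, 0.1],
    "temporal_structural": [0.1, 0.7, 0.2],
}
# Hypothetical equal weights; the paper computes its weights automatically.
example_weights = {name: 1.0 for name in example_scores}

if __name__ == "__main__":
    fused, predicted = fuse_scores(example_scores, example_weights)
    print(fused, predicted)
```

Changing the weights shifts the fused decision toward the models that are trusted more, which is why the choice of weights matters for this kind of fusion.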
Similar resources
Privacy Spatial and Temporal Distances in Nomadic Settlements
Humans, always in interaction with their social environment, consider some degree of privacy, for different purposes, for themselves, the people around them, and their activities. Creating privacy depends on two elements: the subjective meanings that govern the creation of privacy, and the facilities available to the person. Privacy is not seen, heard, smelled and availabil...
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of protein localization is among the most important issues in bioinformatics and is used to predict the location of proteins in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
Meaningful Sequence of Activities in Housing: A Case Study of the Qashqai Tribe
The sequence of activities has evident and objective aspects in people's living environments that depend on subjective and meaningful aspects of their culture and lifestyle. The sequence of activities, in its two forms, "spatial and temporal", is a way of separating or aggregating activities in the dwellings of different cultures, and lies at the origin of the formation of behaviour settings. The theor...
Multivariate Feature Extraction for Prediction of Future Gene Expression Profile
Introduction: The features of a cell can be extracted from its gene expression profile. If the gene expression profiles of future descendant cells are predicted, the features of the future cells are also predicted. The objective of this study was to design an artificial neural network to predict gene expression profiles of descendant cells that will be generated by division/differentiation of h...
Journal: CoRR
Volume: abs/1608.05267
Publication date: 2016